## OVERVIEW

this folder contains the replication materials for 'Non-separable Preferences in the Statistical Analysis of Roll Call Votes' by garret binding and lukas f. stoetzer.

below, we give more information on the R scripts and data provided in these materials, the R packages and their versions used in this analysis, and the mapping of plots from their filenames to figures in the paper.

## R SCRIPTS

the master_*.R files are wrapper scripts, one for each of the three applications: monte carlo simulations (_mc), us senate (_us), and european parliament (_ep). each calls all individual scripts for its application in the correct order, so that no further manual intervention is necessary.
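as an illustration of what such a wrapper does, here is a minimal sketch, not the code shipped in these materials. the function name run_application and the filename pattern (e.g. 01_us.R) are assumptions made for this example; the actual script names may differ.

```r
# minimal sketch of a wrapper that sources numbered scripts in order.
# assumption: scripts are named <step>_<app>.R, e.g. 01_us.R ... 04_us.R
run_application <- function(app, script_dir = ".", steps = 1:4) {
  scripts <- file.path(script_dir, sprintf("%02d_%s.R", steps, app))

  # fail early if any script in the sequence is missing
  missing <- scripts[!file.exists(scripts)]
  if (length(missing) > 0) {
    stop("missing script(s): ", paste(missing, collapse = ", "))
  }

  for (s in scripts) {
    message("running ", s)
    # each script is evaluated in its own environment
    source(s, local = new.env())
  }
  invisible(scripts)
}

# usage (hypothetical):
# run_application("us")
```

sourcing each script in a separate environment is a design choice of this sketch: it forces steps to communicate through files on disk rather than through objects left over in the workspace.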

00_pckgs.R installs the necessary packages for the execution of the subsequent scripts.

each of the three applications has four scripts that perform analogous steps and are executed sequentially:

- 01: set up data for later access. in mc, data is artificially generated. in us, data is downloaded from rvoteview (https://github.com/voteview/Rvoteview). in ep, data from hix, noury, and roland (2006; https://personal.lse.ac.uk/hix/hixnouryrolandepdata.htm) is downloaded. data is saved to data/.

- 02: run models in cmdstanr. the separable specification is always run first, followed by the non-separable specification. the two stan models called by cmdstanr are saved as mod_mc.stan (used for mc simulations, which have no temporal component) and mod_dyn.stan (used for us and ep, which include lags of t-1 as random effects per legislator and dimension). for mc, the script needs to be rerun 10 times because simulations are broken down into blocks of 100. for us and ep, if models do not converge in the first round (which is the rare exception), they can be rerun in a second round using initial values based on estimates from the first round. results are written out to results/.

- 03: raw results from cmdstanr are reorganized into cleaner dataframes. these dataframes are written out to results/ and provided in these replication materials.

- 04: results are visualized using ggplot2.
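the block structure of the mc runs in the 02 step can be pictured as follows. this is a sketch only: the total of 1,000 simulations is inferred from "10 reruns of blocks of 100", and the variable names are hypothetical rather than taken from the scripts.

```r
# assumption: 1,000 simulations in total, i.e. 10 blocks of 100
n_sims <- 1000
block_size <- 100

# assign each simulation id to a block (1, ..., 10)
block_id <- ceiling(seq_len(n_sims) / block_size)
blocks <- split(seq_len(n_sims), block_id)

# hypothetical: pick the block to run in the current session
current_block <- 3
sims_to_run <- blocks[[current_block]]

# first and last simulation id in this block
range(sims_to_run)
```

with these settings, block 3 covers simulation ids 201 to 300.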

overall, the 02 step takes roughly 102 hrs for mc, roughly 120 hrs for ep, and roughly 75 hrs for us. these timings are based on a pc running ubuntu 20.04 with 16 cpus and 128 gb ram.

## R DATA

the cleaned dataframes produced in the 03-step from the raw output of the 02-step are saved to results/mc/full.Rda, results/ep/full.Rda, and results/us/full.Rda. users who only wish to visualize the results (i.e., the 04-step discussed above) can use these files, which are included in results/ in these replication materials, and skip the computationally expensive and time-consuming 02-step.
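one way to inspect such a file without cluttering the workspace is sketched below. the helper load_results is a hypothetical name introduced here; nothing is assumed about the names of the objects stored inside the .Rda files.

```r
# load an .Rda file into its own environment and
# return the objects it contains as a named list
load_results <- function(path) {
  stopifnot(file.exists(path))
  env <- new.env()
  load(path, envir = env)
  mget(ls(env), envir = env)
}

# usage (path as given in this readme):
# res_us <- load_results("results/us/full.Rda")
# names(res_us)   # shows which objects the file provides
```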

## R PACKAGES

the r packages used to run this analysis are specified below.

# sessionInfo()
# R version 4.1.1 (2021-08-10)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 20.04.3 LTS
# 
# Matrix products: default
# BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
# LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
# 
# locale:
#   [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
# [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
# 
# attached base packages:
#   [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] ggrepel_0.9.1   readxl_1.3.1    Rvoteview_0.1   abind_1.4-5     posterior_1.1.0 bayesplot_1.8.1 cmdstanr_0.4.0  xml2_1.3.2      rvest_1.0.2     devtools_2.4.2  usethis_2.1.0   forcats_0.5.1   stringr_1.4.0  
# [14] purrr_0.3.4     readr_2.0.2     tibble_3.1.5    ggplot2_3.3.5   tidyverse_1.3.1 loo_2.4.1       tidyr_1.1.4     dplyr_1.0.7    
# 
# loaded via a namespace (and not attached):
#   [1] matrixStats_0.61.0   fs_1.5.0             lubridate_1.8.0      httr_1.4.2           rprojroot_2.0.2      rstan_2.21.2         tensorA_0.36.2       tools_4.1.1          backports_1.2.1      utf8_1.2.2          
# [11] R6_2.5.1             DBI_1.1.1            colorspace_2.0-2     withr_2.4.2          tidyselect_1.1.1     gridExtra_2.3        prettyunits_1.1.1    processx_3.5.2       curl_4.3.2           compiler_4.1.1      
# [21] cli_3.0.1            desc_1.4.0           scales_1.1.1         checkmate_2.0.0      ggridges_0.5.3       callr_3.7.0          StanHeaders_2.21.0-7 pscl_1.5.5           pkgconfig_2.0.3      sessioninfo_1.1.1   
# [31] dbplyr_2.1.1         fastmap_1.1.0        rlang_0.4.11         rstudioapi_0.13      generics_0.1.0       farver_2.1.0         jsonlite_1.7.2       distributional_0.2.2 inline_0.3.19        magrittr_2.0.1      
# [41] Rcpp_1.0.7           munsell_0.5.0        fansi_0.5.0          lifecycle_1.0.1      stringi_1.7.5        MASS_7.3-54          pkgbuild_1.2.0       plyr_1.8.6           grid_4.1.1           parallel_4.1.1      
# [51] crayon_1.4.1         haven_2.4.3          hms_1.1.1            knitr_1.36           ps_1.6.0             pillar_1.6.3         codetools_0.2-18     stats4_4.1.1         pkgload_1.2.3        fastmatch_1.1-3     
# [61] reprex_2.0.1         glue_1.4.2           V8_3.4.2             remotes_2.4.1        RcppParallel_5.1.4   modelr_0.1.8         vctrs_0.3.8          tzdb_0.1.2           testthat_3.1.0       cellranger_1.1.0    
# [71] gtable_0.3.0         assertthat_0.2.1     cachem_1.0.6         xfun_0.26            broom_0.7.9          rstiefel_1.0.1       memoise_2.0.0        ellipsis_0.3.2      
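
to compare a local installation against the versions listed above, something along the following lines may help. this is a sketch: compare_versions is a hypothetical helper, and the expected vector holds only a small subset of the packages from the listing.

```r
# subset of the attached packages listed above; extend as needed
expected <- c(cmdstanr = "0.4.0", posterior = "1.1.0",
              ggplot2 = "3.3.5", loo = "2.4.1")

compare_versions <- function(expected) {
  # look up the installed version of each package; NA if not installed
  installed <- vapply(names(expected), function(p) {
    tryCatch(as.character(utils::packageVersion(p)),
             error = function(e) NA_character_)
  }, character(1))

  data.frame(
    package = names(expected),
    expected = unname(expected),
    installed = unname(installed),
    match = !is.na(installed) & unname(installed == expected),
    stringsAsFactors = FALSE
  )
}

compare_versions(expected)
```

a mismatch does not necessarily mean the scripts will fail; it only flags where the local setup differs from the one used for the analysis.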

## MAPPING PLOTS

in this section, we provide the mapping from the files generated by this code to the figures included in the article and the supplementary information. the plots are provided in plots/ in the replication materials.

main article:
plots/out/indifference_sep.pdf + plots/out/indifference_sub.pdf -> figure 1
plots/mc/sals_e_cmbn.pdf -> figure 2
plots/mc/cor_cmbn.pdf -> figure 3a
plots/mc/cor_dim_cmbn.pdf -> figure 3b
plots/us/nsep_e.pdf -> figure 4a
plots/us/sal_e.pdf -> figure 4b
plots/us/cor_e.pdf -> figure 5
plots/us/move_2.pdf -> figure 6
plots/ep/nsep_e.pdf -> figure 7

supplementary information - mc section:
plots/mc/sals_e_sal.pdf -> figure 1a
plots/mc/sals_e_cor.pdf -> figure 1b
plots/mc/nsep_e.pdf -> figure 2a
plots/mc/nsep_e_sal.pdf -> figure 2b
plots/mc/nsep_e_cor.pdf -> figure 2c
plots/mc/cor_cmbn_sal.pdf -> figure 3a
plots/mc/cor_cmbn_cor.pdf -> figure 3b
plots/mc/rsq_e.pdf -> figure 4a
plots/mc/rsq_e_sal.pdf -> figure 4b
plots/mc/rsq_e_cor.pdf -> figure 4c
plots/mc/gamma_e.pdf -> figure 5a
plots/mc/gamma_e_sal.pdf -> figure 5b
plots/mc/gamma_e_cor.pdf -> figure 5c
plots/mc/cor_dim_sal.pdf -> figure 6a
plots/mc/cor_dim_cor.pdf -> figure 6b
plots/mc/loo_cmbn_1.pdf -> figure 7a
plots/mc/loo_cmbn_sh.pdf -> figure 7b
plots/mc/pred.pdf -> figure 8
plots/mc/compl.pdf -> figure 9

supplementary information - us section:
plots/us/cor_spec_e.pdf -> figure 10
plots/us/rsq_e.pdf -> figure 11
plots/us/coef_e.pdf -> figure 12
plots/us/corold_e.pdf -> figure 13
plots/us/ideal_nsep_e.pdf -> figure 14
plots/us/ideal_sep_e.pdf -> figure 15
plots/us/dist2.pdf -> figure 16
plots/us/dist.pdf -> figure 17
plots/us/loo.pdf -> figure 18
plots/us/pred.pdf -> figure 19
plots/us/compl.pdf -> figure 20

supplementary information - ep section:
plots/ep/sal_e.pdf -> figure 21
plots/ep/cor_spec_e.pdf -> figure 22
plots/ep/corold_e.pdf -> figure 23
plots/ep/cor_e.pdf -> figure 24
plots/ep/rsq_e.pdf -> figure 25
plots/ep/coef_e.pdf -> figure 26
plots/ep/ideal_nsep_e.pdf -> figure 27
plots/ep/ideal_sep_e.pdf -> figure 28
plots/ep/dist2.pdf -> figure 29
plots/ep/dist.pdf -> figure 30
plots/ep/loo.pdf -> figure 31
plots/ep/pred.pdf -> figure 32
plots/ep/compl.pdf -> figure 33
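
after running the 04 step, the mapping above can be used to check that the expected files were written. the sketch below covers only a few entries from the main-article mapping; fig_map and check_plots are hypothetical names introduced for this example.

```r
# excerpt of the main-article mapping listed above
fig_map <- data.frame(
  file = c("plots/mc/sals_e_cmbn.pdf", "plots/mc/cor_cmbn.pdf",
           "plots/us/nsep_e.pdf", "plots/ep/nsep_e.pdf"),
  figure = c("2", "3a", "4a", "7"),
  stringsAsFactors = FALSE
)

# add a column indicating whether each file exists under root
check_plots <- function(map, root = ".") {
  map$exists <- file.exists(file.path(root, map$file))
  map
}

check_plots(fig_map)
```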
